Locomotion
Meta-Reinforcement Learning of Structured Exploration Strategies
Abhishek Gupta, Russell Mendonca, YuXuan Liu, Pieter Abbeel, Sergey Levine
Exploration is a fundamental challenge in reinforcement learning (RL). Many current exploration methods for deep RL use task-agnostic objectives, such as information gain or bonuses based on state visitation. However, many practical applications of RL involve learning more than a single task, and prior tasks can be used to inform how exploration should be performed in new tasks. In this work, we study how prior tasks can inform an agent about how to explore effectively in new situations. We introduce a novel gradient-based fast adaptation algorithm, model-agnostic exploration with structured noise (MAESN), to learn exploration strategies from prior experience. The prior experience is used both to initialize a policy and to acquire a latent exploration space that can inject structured stochasticity into a policy, producing exploration strategies that are informed by prior knowledge and are more effective than random action-space noise. We show that MAESN is more effective at learning exploration strategies when compared to prior meta-RL methods, RL without learned exploration strategies, and task-agnostic exploration methods. We evaluate our method on a variety of simulated tasks: locomotion with a wheeled robot, locomotion with a quadrupedal walker, and object manipulation.
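To make the idea concrete, here is a minimal sketch of MAESN-style structured exploration in PyTorch: a policy conditioned on a per-episode latent z drawn from learned variational parameters, plus one inner-loop gradient step on those parameters. The class name, network sizes, and the placeholder objective are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class MAESNPolicy(nn.Module):
    """Policy conditioned on a per-episode exploration latent z."""
    def __init__(self, obs_dim, act_dim, latent_dim=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + latent_dim, 64), nn.Tanh(),
            nn.Linear(64, act_dim))
        # Per-task variational parameters of the latent exploration space.
        self.mu = nn.Parameter(torch.zeros(latent_dim))
        self.log_sigma = nn.Parameter(torch.zeros(latent_dim))

    def sample_latent(self):
        # Reparameterized sample so gradients reach (mu, log_sigma).
        eps = torch.randn_like(self.mu)
        return self.mu + self.log_sigma.exp() * eps

    def forward(self, obs, z):
        return self.net(torch.cat([obs, z], dim=-1))

policy = MAESNPolicy(obs_dim=4, act_dim=2)

# Sample z once and hold it fixed for the whole episode: the injected
# stochasticity is temporally coherent, unlike per-step action noise.
z = policy.sample_latent()
obs = torch.randn(4)
for _ in range(5):
    action = policy(obs, z)      # same z at every step
    obs = torch.randn(4)         # stand-in for env.step(action)

# Inner-loop adaptation (placeholder objective): gradients flow into
# (mu, log_sigma) through the reparameterized z, shifting where the
# policy explores on the current task.
surrogate = policy(obs, z).sum()
g_mu, g_ls = torch.autograd.grad(surrogate, [policy.mu, policy.log_sigma])
with torch.no_grad():
    policy.mu += 0.1 * g_mu
    policy.log_sigma += 0.1 * g_ls
```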
Meet the 3D-printed robot that walks without electronics
Researchers at the Bioinspired Robotics and Design Lab at UC San Diego created a fully 3D-printed, six-legged robot that walks using compressed air. It has no electronics, motors, or batteries: just soft actuators powered by gas. Tested on various terrains, it operates continuously with a steady air supply.
AcL: Action Learner for Fault-Tolerant Quadruped Locomotion Control
Xu, Tianyu, Cheng, Yaoyu, Shen, Pinxi, Zhao, Lin
Quadrupedal robots can learn versatile locomotion skills but remain vulnerable when one or more joints lose power. In contrast, dogs and cats can adopt limping gaits when injured, demonstrating their remarkable ability to adapt to physical conditions. Inspired by such adaptability, this paper presents Action Learner (AcL), a novel teacher-student reinforcement learning framework that enables quadrupeds to autonomously adapt their gait for stable walking under multiple joint faults. Unlike conventional teacher-student approaches that enforce strict imitation, AcL leverages teacher policies to generate style rewards, guiding the student policy without requiring precise replication. We train multiple teacher policies, each corresponding to a different fault condition, and subsequently distill them into a single student policy with an encoder-decoder architecture. While prior works primarily address single-joint faults, AcL enables quadrupeds to walk with up to four faulty joints across one or two legs, autonomously switching between different limping gaits when faults occur. Quadruped robots are gaining popularity as versatile mobile platforms capable of navigating diverse terrains and performing robust locomotion tasks such as search and rescue operations in buildings, cargo delivery in cities, and planetary exploration. In such scenarios, quadrupeds may encounter faults that cannot be immediately repaired, requiring them to continue their tasks despite the malfunction.
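The style-reward idea can be sketched as a soft similarity bonus rather than a strict imitation loss. The toy example below is a hedged illustration, assuming the teacher for the active fault condition proposes an action; the function names, the Gaussian similarity kernel, and the weights are assumptions, not the AcL paper's exact formulation.

```python
import numpy as np

def style_reward(student_action, teacher_action, sigma=0.5):
    """Soft similarity bonus: 1 when actions match, decaying smoothly."""
    d2 = np.sum((student_action - teacher_action) ** 2)
    return np.exp(-d2 / (2 * sigma ** 2))

def total_reward(task_reward, student_action, teacher_action, w_style=0.3):
    # The style term guides but never dominates the task objective, so
    # the student may deviate from the teacher when a fault demands it.
    return task_reward + w_style * style_reward(student_action, teacher_action)

# Example: teacher trained for a hypothetical "front-left knee locked" fault.
a_student = np.array([0.2, -0.1, 0.4])
a_teacher = np.array([0.25, -0.05, 0.35])
print(total_reward(task_reward=1.0, student_action=a_student,
                   teacher_action=a_teacher))
```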
Fancy humanoid robot no longer walks like it urgently needs a toilet
Human-looking bipedal robots can already run, jump, breakdance, punch, and generally perform broad feats of athletic prowess most humans could only dream of. One thing they are still pretty bad at, though, is walking a straight line without looking like they are moments away from soiling themselves. Figure AI, one of the buzziest startups in the humanoid robot space, now says it has engineered a solution to address its machines' stiff shuffle-step. The more natural-looking stride was achieved by analyzing thousands of virtual humanoid robots walking simultaneously in a simulated digital environment, Figure explained in a recent blog post. The company used reinforcement learning, rewarding the virtual robots for actions like synchronized arm swings, heel strikes, and toe-offs (when the toe leaves the ground) that more closely resemble human movement.
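Figure has not published its reward function, but reward shaping of this kind can be sketched as a sum of bonuses for human-like gait events. Everything below, including the event inputs, weights, and the phase-offset term, is an assumption for illustration only.

```python
import math

def gait_style_reward(heel_strike, toe_off, arm_phase, leg_phase):
    """Bonus for human-like gait events during one simulated step."""
    r = 0.0
    if heel_strike:
        r += 0.5                  # reward landing heel-first
    if toe_off:
        r += 0.5                  # reward pushing off from the toe
    # Arms should swing roughly out of phase with the same-side leg
    # (offset of about pi), mimicking a natural human arm swing.
    r += 0.2 * math.cos(arm_phase - leg_phase - math.pi)
    return r

print(gait_style_reward(True, False, arm_phase=0.0, leg_phase=math.pi))
```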
TAR: Teacher-Aligned Representations via Contrastive Learning for Quadrupedal Locomotion
Mousa, Amr, Karavis, Neil, Caprio, Michele, Pan, Wei, Allmendinger, Richard
Quadrupedal locomotion via Reinforcement Learning (RL) is commonly addressed using the teacher-student paradigm, where a privileged teacher guides a proprioceptive student policy. However, key challenges, such as representation misalignment between the privileged teacher and the proprioceptive-only student, covariate shift due to behavioral cloning, and a lack of deployable adaptation, lead to poor generalization in real-world scenarios. We propose Teacher-Aligned Representations via Contrastive Learning (TAR), a framework that leverages privileged information with self-supervised contrastive learning to bridge this gap. By aligning representations to a privileged teacher in simulation via contrastive objectives, our student policy learns structured latent spaces and exhibits robust generalization to Out-of-Distribution (OOD) scenarios, surpassing the fully privileged "Teacher". Results show training accelerated by 2x compared to state-of-the-art baselines in reaching peak performance, and OOD scenarios show better generalization by 40% on average compared to existing methods. Open-source code and videos are available at https://ammousa.github.io/TARLoco/.
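A common way to align student and teacher representations contrastively is an InfoNCE-style objective over paired embeddings; the sketch below assumes that standard setup and is not the TAR authors' implementation.

```python
import torch
import torch.nn.functional as F

def alignment_loss(z_student, z_teacher, temperature=0.1):
    """InfoNCE: matched (student_i, teacher_i) pairs are positives;
    every other teacher embedding in the batch is a negative."""
    z_s = F.normalize(z_student, dim=-1)
    z_t = F.normalize(z_teacher, dim=-1)
    logits = z_s @ z_t.t() / temperature   # (B, B) cosine similarities
    labels = torch.arange(z_s.size(0))     # positives on the diagonal
    return F.cross_entropy(logits, labels)

z_s = torch.randn(32, 64)  # student latents from proprioception only
z_t = torch.randn(32, 64)  # teacher latents with privileged observations
print(alignment_loss(z_s, z_t).item())
```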
Behavioral Conflict Avoidance Between Humans and Quadruped Robots in Shared Environments
Wei, Shuang, Zhang, Muhua, Gan, Yun, Huang, Deqing, Ma, Lei, Yang, Chenguang
Nowadays, robots are increasingly operated in environments shared with humans, where conflicts between human and robot behaviors may compromise safety. This paper presents a proactive behavioral conflict avoidance framework for quadruped robots, based on the principle of adaptation to trends, that not only ensures the robot's safety but also minimizes interference with human activities. It can proactively avoid potential conflicts with approaching humans or other dynamic objects, whether the robot is stationary or in motion, and then swiftly resume its tasks once the conflict subsides. An enhanced approach is proposed to achieve precise human detection and tracking on a vibratory robot platform equipped with a low-cost hybrid solid-state LiDAR. When a potential conflict is detected, the robot selects an avoidance point and executes an evasion maneuver before resuming its task. This approach contrasts with conventional methods that remain goal-driven, which often results in aggressive behaviors, such as forcibly bypassing obstacles and causing conflicts, or becoming stuck in deadlock scenarios. Avoidance points are selected by integrating static and dynamic obstacles to generate a potential field map; the robot then searches for feasible regions within this map and determines the optimal avoidance point using an evaluation function. Experimental results demonstrate that the framework significantly reduces interference with human activities and enhances the safety of both robots and people.
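A hedged sketch of that selection step: build a potential field from static and dynamic obstacle costs, then score candidate avoidance points with an evaluation function trading off safety, detour length, and ease of resuming the goal. The weights, the short-horizon motion prediction, and the scoring terms are illustrative assumptions, not the paper's exact design.

```python
import numpy as np

def potential(p, static_obs, dynamic_obs, w_static=1.0, w_dyn=2.0):
    """Higher potential = less safe. Dynamic obstacles (e.g. a walking
    person) are weighted more heavily than static ones."""
    cost = 0.0
    for o in static_obs:
        cost += w_static / (np.linalg.norm(p - o) + 1e-3)
    for o, v in dynamic_obs:                 # (position, velocity) pairs
        predicted = o + 0.5 * v              # short-horizon prediction
        cost += w_dyn / (np.linalg.norm(p - predicted) + 1e-3)
    return cost

def pick_avoidance_point(candidates, robot, goal, static_obs, dynamic_obs):
    # Evaluation: low potential, small detour from the current position,
    # and easy resumption toward the goal once the conflict subsides.
    def score(p):
        return (potential(p, static_obs, dynamic_obs)
                + 0.5 * np.linalg.norm(p - robot)
                + 0.2 * np.linalg.norm(p - goal))
    return min(candidates, key=score)

robot, goal = np.array([0.0, 0.0]), np.array([5.0, 0.0])
static_obs = [np.array([2.0, 0.5])]
dynamic_obs = [(np.array([3.0, -1.0]), np.array([-1.0, 0.5]))]
cands = [np.array([1.0, 1.5]), np.array([1.0, -1.5]), np.array([0.0, 2.0])]
print(pick_avoidance_point(cands, robot, goal, static_obs, dynamic_obs))
```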
Autonomous Exploration-Based Precise Mapping for Mobile Robots through Stepwise and Consistent Motions
Zhang, Muhua, Ma, Lei, Wu, Ying, Shen, Kai, Sun, Yongkui, Leung, Henry
This paper presents an autonomous exploration framework designed for indoor ground mobile robots that use laser Simultaneous Localization and Mapping (SLAM), ensuring process completeness and precise mapping results. For frontier search, a local-global sampling architecture based on multiple Rapidly-exploring Random Trees (RRTs) is employed. Traversability checks during RRT expansion and global RRT pruning upon map updates eliminate unreachable frontiers, reducing potential collisions and deadlocks. Adaptive sampling density adjustments, informed by obstacle distribution, enhance exploration coverage. For frontier point navigation, a stepwise consistent motion strategy is adopted, wherein the robot drives strictly straight along approximately equidistant line segments of the polyline path and rotates in place at segment junctions. This simplified, decoupled motion pattern improves scan-matching stability and mitigates map drift. For process control, the framework serializes frontier point selection and navigation, avoiding the oscillation caused by frequent goal changes in conventional parallelized processes. A waypoint retracing mechanism is introduced to generate repeated observations, triggering loop closure detection and backend optimization in graph-based SLAM, thereby improving map consistency and precision. Experiments in both simulation and real-world scenarios validate the effectiveness of the framework. It achieves improved mapping coverage and precision in more challenging environments compared to baseline 2D exploration algorithms, and it shows robustness in supporting resource-constrained robot platforms and maintaining mapping consistency across various LiDAR field-of-view (FoV) configurations.
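The stepwise, decoupled motion pattern can be sketched as follows, assuming a polyline path of waypoints; the command names are hypothetical. The key property is that rotation and translation never overlap, which is what stabilizes scan matching.

```python
import math

def follow_polyline(waypoints, pose):
    """Rotate in place at each junction, then drive straight: translation
    and rotation are never commanded at the same time."""
    x, y, theta = pose
    commands = []
    for wx, wy in waypoints:
        heading = math.atan2(wy - y, wx - x)
        # Wrap the turn angle into (-pi, pi] before rotating in place.
        turn = (heading - theta + math.pi) % (2 * math.pi) - math.pi
        commands.append(("rotate_in_place", turn))
        commands.append(("drive_straight", math.hypot(wx - x, wy - y)))
        x, y, theta = wx, wy, heading
    return commands

for cmd in follow_polyline([(1, 0), (1, 1), (3, 1)], (0.0, 0.0, 0.0)):
    print(cmd)
```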
Transferable Latent-to-Latent Locomotion Policy for Efficient and Versatile Motion Control of Diverse Legged Robots
Zheng, Ziang, Zhan, Guojian, Shuai, Bin, Qin, Shengtao, Li, Jiangtao, Zhang, Tao, Li, Shengbo Eben
Reinforcement learning (RL) has demonstrated remarkable capability in acquiring robot skills, but learning each new skill still requires substantial data collection for training. The pretrain-and-finetune paradigm offers a promising approach for efficiently adapting to new robot entities and tasks. Inspired by the idea that acquired knowledge can accelerate learning new tasks with the same robot and help a new robot master an already-trained task, we propose a latent training framework in which a transferable latent-to-latent locomotion policy is pretrained alongside diverse task-specific observation encoders and action decoders. This policy operates in latent space, processing encoded latent observations to generate latent actions to be decoded, with the potential to learn general abstract motion skills. To retain essential information for decision-making and control, we introduce a diffusion recovery module that minimizes information reconstruction loss during the pretraining stage. During the fine-tuning stage, the pretrained latent-to-latent locomotion policy remains fixed, while only the lightweight task-specific encoder and decoder are optimized for efficient adaptation. Our method allows a robot to leverage its own prior experience across different tasks, as well as the experience of other morphologically diverse robots, to accelerate adaptation. We validate our approach through extensive simulations and real-world experiments, demonstrating that the pretrained latent-to-latent locomotion policy generalizes effectively to new robot entities and tasks with improved efficiency.
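A minimal sketch of the latent-to-latent layout, assuming simple linear encoders/decoders around a shared MLP core; module names and sizes are illustrative, not the authors' architecture. During fine-tuning the shared core is frozen and only the lightweight task-specific ends are optimized.

```python
import torch
import torch.nn as nn

class LatentToLatent(nn.Module):
    def __init__(self, obs_dim, act_dim, latent_dim=32):
        super().__init__()
        self.encoder = nn.Linear(obs_dim, latent_dim)   # task-specific
        self.core = nn.Sequential(                      # shared, pretrained
            nn.Linear(latent_dim, 128), nn.Tanh(),
            nn.Linear(128, latent_dim))
        self.decoder = nn.Linear(latent_dim, act_dim)   # task-specific

    def forward(self, obs):
        # obs -> latent obs -> latent action -> robot-specific action
        return self.decoder(self.core(self.encoder(obs)))

policy = LatentToLatent(obs_dim=48, act_dim=12)

# Fine-tune stage: freeze the shared latent-to-latent core; only the
# encoder/decoder adapt to the new robot morphology or task.
for p in policy.core.parameters():
    p.requires_grad_(False)
trainable = [p for p in policy.parameters() if p.requires_grad]
optim = torch.optim.Adam(trainable, lr=3e-4)
```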